Parallel FFT Implementations on Fixed-Point DSP-Cores with Subword-Parallelism
Authors
Abstract
Fast Fourier transform (FFT) algorithms are vital in many digital signal processing (DSP) applications. This paper presents radix-2 and radix-4 complex FFT implementations for fixed-point applications that use single instruction, multiple data (SIMD) instructions and sub-word parallelism (SWP). It is shown that data management and memory access are key to unleashing the arithmetic power of highly parallel DSP cores. The presented radix-2 implementation works for unconditioned data whose length N is a power of 2, but it cannot fully utilize the multiply-accumulate (MAC) units. In contrast, the discussed mixed-radix-4 implementation works for pre-conditioned data as found in orthogonal frequency division multiplexing (OFDM) and is customized to length N = 256. This leads to near-optimal MAC utilization on the TigerSHARC™.
Similar resources
High-performance FFT implementation on the BOPS ManArray parallel DSP
We present a high performance implementation of the FFT algorithm on the BOPS ManArray parallel DSP processor. The ManArray we consider for this application consists of an array controller and 2 to 4 fully interconnected processing elements. To expose the parallelism inherent to an FFT algorithm we use a factorization of the DFT matrix in Kronecker products, permutation and diagonal matrices. O...
Ultra-Low-Energy DSP Processor Design for Many-Core Parallel Applications
Background and Objectives: Digital signal processors are widely used in energy constrained applications in which battery lifetime is a critical concern. Accordingly, designing ultra-low-energy processors is a major concern. In this work and in the first step, we propose a sub-threshold DSP processor. Methods: As our baseline architecture, we use a modified version of an existing ultra-low-power...
A Block Floating Point Implementation for an N-Point FFT on the TMS320C55x DSP
A block floating-point (BFP) implementation provides an innovative method of floating-point emulation on a fixed-point processor. This application report implements the BFP algorithm for the Fast Fourier Transform (FFT) algorithm on a Texas Instruments (TI) TMS320C55x DSP by taking advantage of the CPU exponent encoder. The BFP algorithm as it applies to the FFT allows signal gain adjustment i...
Systematic Exploration of Trade-Offs between Application Throughput and Hardware Resource Requirements in DSP Systems
Title of dissertation: SYSTEMATIC EXPLORATION OF TRADE-OFFS BETWEEN APPLICATION THROUGHPUT AND HARDWARE RESOURCE REQUIREMENTS IN DSP SYSTEMS Hojin Kee, Doctor of Philosophy, 2010 Dissertation directed by: Shuvra S. Bhattacharyya, Professor Department of Electrical and Computer Engineering, and Institute for Advanced Computer Studies Dataflow has been used extensively as an efficient model-of-co...
Development of an FPGA-Based Two-Transform Pulse Compressor Mr. Skip
Recent advances in Field Programmable Gate Array (FPGA) technologies have resulted in high gate count and high performance FPGA parts which offer a cost-effective and short development cycle solution for computation intensive signal processor applications. These parts provide an attractive middle ground between Commercial Off-the-Shelf (COTS) boards employing Digital Signal Processor (DSP) chip...
Journal title:
Volume, issue:
Pages: -
Publication date: 2005